NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

AdaBB: Adaptive Barzilai-Borwein Method for Convex Optimization

https://doi.org/10.1287/moor.2024.0510

Zhou, Danqing; Ma, Shiqian; Yang, Junfeng (March 2025, Mathematics of Operations Research)

In this paper, we propose AdaBB, an adaptive gradient method based on the Barzilai-Borwein stepsize. The algorithm is line-search-free and parameter-free, and it essentially provides a convergent variant of the Barzilai-Borwein method for general convex optimization problems. We analyze the ergodic convergence of the objective function value and the convergence of the iterates for solving general convex optimization problems. Compared with existing works along this line of research, our algorithm gives the best lower bounds on the stepsize and the average of the stepsizes. Furthermore, we present extensions of the proposed algorithm for solving locally strongly convex and composite convex optimization problems where the objective function is the sum of a smooth function and a nonsmooth function. In the case of local strong convexity, we achieve linear convergence. Our numerical results also demonstrate very promising potential of the proposed algorithms on some representative examples. Funding: S. Ma is supported by the National Science Foundation [Grants DMS-2243650, CCF-2308597, CCF-2311275, and ECCS-2326591] and a startup fund from Rice University. J. Yang is supported by the National Natural Science Foundation of China [Grants 12431011 and 12371301] and the Natural Science Foundation for Distinguished Young Scholars of Gansu Province [Grant 22JR5RA223].
more » « less
Free, publicly-accessible full text available March 31, 2026
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis

Yang, Yifan; Ban, Hao; Huang, Minhui; Ma, Shiqian; Ji, Kaiyi (May 2025, International Conference on Learning Representations)

Bilevel optimization has recently attracted considerable attention due to its abundant applications in machine learning problems. However, existing methods rely on prior knowledge of problem parameters to determine stepsizes, resulting in significant effort in tuning stepsizes when these parameters are unknown. In this paper, we propose two novel tuning-free algorithms, D-TFBO and S-TFBO. D-TFBO employs a double-loop structure with stepsizes adaptively adjusted by the "inverse of cumulative gradient norms" strategy. S-TFBO features a simpler fully single-loop structure that updates three variables simultaneously with a theory-motivated joint design of adaptive stepsizes for all variables. We provide a comprehensive convergence analysis for both algorithms and show that D-TFBO and S-TFBO respectively require $$\mathcal{O}(\frac{1}{\epsilon})$$ and $$\mathcal{O}(\frac{1}{\epsilon}\log^4(\frac{1}{\epsilon}))$$ iterations to find an $$\epsilon$$-accurate stationary point, (nearly) matching their well-tuned counterparts using the information of problem parameters. Experiments on various problems show that our methods achieve performance comparable to existing well-tuned approaches, while being more robust to the selection of initial stepsizes. To the best of our knowledge, our methods are the first to completely eliminate the need for stepsize tuning, while achieving theoretical guarantees.
more » « less
Free, publicly-accessible full text available May 1, 2026
First-Order Federated Bilevel Learning

https://doi.org/10.1609/aaai.v39i21.34355

Yang, Yifan; Xiao, Peiyao; Ma, Shiqian; Ji, Kaiyi (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Federated bilevel optimization (FBO) has garnered significant attention lately, driven by its promising applications in meta-learning and hyperparameter optimization. Existing algorithms generally aim to approximate the gradient of the upper-level objective function (hypergradient) in the federated setting. However, because of the nonlinearity of the hypergradient and client drift, they often involve complicated computations. These computations, like multiple optimization sub-loops and second-order derivative evaluations, end up with significant memory consumption and high computational costs. In this paper, we propose a computationally and memory-efficient FBO algorithm named MemFBO. MemFBO features a fully single-loop structure with all involved variables updated simultaneously, and uses only first-order gradient information for all local updates. We show that MemFBO exhibits a linear convergence speedup with milder assumptions in both partial and full client participation scenarios. We further implement MemFBO in a novel FBO application for federated data cleaning. Our experiments, conducted on this application and federated hyper-representation, demonstrate the effectiveness of the proposed algorithm.
more » « less
Free, publicly-accessible full text available April 11, 2026
First-Order Federated Bilevel Learning

Yang, Yifan; Xiao, Peiyao; Ma, Shiqian; Ji, Kaiyi (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Federated bilevel optimization (FBO) has garnered significant attention lately, driven by its promising applications in meta-learning and hyperparameter optimization. Existing algorithms generally aim to approximate the gradient of the upper-level objective function (hypergradient) in the federated setting. However, because of the nonlinearity of the hypergradient and client drift, they often involve complicated computations. These computations, like multiple optimization sub-loops and second-order derivative evaluations, end up with significant memory consumption and high computational costs. In this paper, we propose a computationally and memory-efficient FBO algorithm named MemFBO. MemFBO features a fully single-loop structure with all involved variables updated simultaneously, and uses only first-order gradient information for all local updates. We show that MemFBO exhibits a linear convergence speedup with milder assumptions in both partial and full client participation scenarios. We further implement MemFBO in a novel FBO application for federated data cleaning. Our experiments, conducted on this application and federated hyper-representation, demonstrate the effectiveness of the proposed algorithm.
more » « less
Free, publicly-accessible full text available April 11, 2026
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis

Yang, Yifan; Ban, Hao; Huang, Minhui; Ma, Shiqian; Ji, Kaiyi (April 2025, ICLR)

Free, publicly-accessible full text available April 1, 2026
Riemannian Bilevel Optimization

Li, Jiaxiang; Ma, Shiqian (January 2025, Journal of machine learning research)

Full Text Available
Zeroth-Order Riemannian Averaging Stochastic Approximation Algorithms

https://doi.org/10.1137/23M1605405

Li, Jiaxiang; Balasubramanian, Krishnakumar; Ma, Shiqian (December 2024, SIAM Journal on Optimization)

Full Text Available
A Riemannian Alternating Direction Method of Multipliers

https://doi.org/10.1287/moor.2023.0068

Li, Jiaxiang; Ma, Shiqian; Srivastava, Tejes (December 2024, Mathematics of Operations Research)

We consider a class of Riemannian optimization problems where the objective is the sum of a smooth function and a nonsmooth function considered in the ambient space. This class of problems finds important applications in machine learning and statistics, such as sparse principal component analysis, sparse spectral clustering, and orthogonal dictionary learning. We propose a Riemannian alternating direction method of multipliers (ADMM) to solve this class of problems. Our algorithm adopts easily computable steps in each iteration. The iteration complexity of the proposed algorithm for obtaining an ϵ-stationary point is analyzed under mild assumptions. Existing ADMMs for solving nonconvex problems either do not allow a nonconvex constraint set or do not allow a nonsmooth objective function. Our algorithm is the first ADMM-type algorithm that minimizes a nonsmooth objective over manifold—a particular nonconvex set. Numerical experiments are conducted to demonstrate the advantage of the proposed method. Funding: The research of S. Ma was supported in part by the Office of Naval Research [Grant N00014-24-1-2705]; the National Science Foundation [Grants DMS-2243650, CCF-2308597, CCF-2311275, and ECCS-2326591]; the University of California, Davis Center for Data Science and Artificial Intelligence Research Innovative Data Science Seed Funding Program; and Rice University start-up fund.
more » « less
Full Text Available
Efficiently Escaping Saddle Points in Bilevel Optimization

Huang, Minhui; Chen, Xuxing; Ji, Kaiyi; Ma, Shiqian; Lai, Lifeng (January 2025, Journal of machine learning research)

Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization focus on finding the firstorder stationary points for nonconvex-strongly-convex cases. In this paper, we analyze algorithms that can escape saddle points in nonconvex-strongly-convex bilevel optimization. Specifically, we show that the perturbed approximate implicit differentiation (AID) with a warm start strategy finds an -approximate local minimum of bilevel optimization in ̃O(−2) iterations with high probability. Moreover, we propose an inexact NEgativecurvature-Originated-from-Noise Algorithm (iNEON), an algorithm that can escape saddle point and find local minimum of stochastic bilevel optimization. As a by-product, we provide the first nonasymptotic analysis of perturbed multi-step gradient descent ascent (GDmax) algorithm that converges to local minimax point for minimax problems.
more » « less
Full Text Available
Efficiently Escaping Saddle Points in Bilevel Optimization

Huang, Minhui; Chen, Xuxing; Ji, Kaiyi; Ma, Shiqian; Lai, Lifeng (January 2025, Journal of machine learning research)

Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization focus on finding the first-order stationary points for nonconvex-strongly-convex cases. In this paper, we analyze algorithms that can escape saddle points in nonconvex-strongly-convex bilevel optimization. Specifically, we show that the perturbed approximate implicit differentiation (AID) with a warm start strategy finds an ϵ-approximate local minimum of bilevel optimization in $$\tilde O(\epsilon^{-2})$$ iterations with high probability. Moreover, we propose an inexact NEgative-curvature-Originated-from-Noise Algorithm (iNEON), an algorithm that can escape saddle point and find local minimum of stochastic bilevel optimization. As a by-product, we provide the first nonasymptotic analysis of perturbed multi-step gradient descent ascent (GDmax) algorithm that converges to local minimax point for minimax problems.
more » « less
Full Text Available

« Prev Next »

Search for: All records